Abstract

The rate at which dependencies between future and past observations decay in a random 
process may be quantified in terms of mixing coefficients. The latter in turn appear in strong 
laws of large numbers and concentration of measure results for dependent random variables. 
Questions regarding what rates are possible for various notions of mixing have been posed since 
the 1960's, and have important implications for some open problems in the theory of strong 
mixing conditions. 

This paper deals with $\eta$-mixing, a notion defined in [Kontorovich and Ramanan], which is
closely related to $\phi$-mixing. We show that there exist measures on finite sequences with essentially
arbitrary $\eta$-mixing coefficients, as well as processes with arbitrarily slow mixing rates.

1 Introduction 

1.1 Preliminaries 

Strong mixing conditions deal with quantifying the decaying dependence between blocks of random 
variables in a stochastic process. These have been traditionally used to establish strong laws of large 
numbers for non-independent processes. Bradley [4, 5, 6] is an encyclopedic source on the matter; 
see also his survey paper [3]. In [6, Chapter 26], Bradley traces the early research on mixing rates
to Volkonskii and Rozanov [19] and gives a comprehensive account of the progress since then.

Our interest in strong mixing was motivated by the desire for concentration of measure bounds 
for non-independent random sequences. Given the excellent survey papers and monographs dealing 
with concentration of measure (in particular, [14], [15], and [18]), we will give only the briefest 
summary here. 

Suppose $\Omega$ is a finite set and let $\mu$ be an arbitrary (nonproduct) probability measure on $\Omega^n$.
We proceed to define a type of strong mixing used throughout this note. For $1 \le i < j \le n$ and
$x \in \Omega^i$, let

$$\mathcal{L}(X_j^n \mid X_1^i = x)$$

be the distribution of $X_j^n = (X_j, \ldots, X_n)$ conditioned on $X_1^i = x$. For $y \in \Omega^{i-1}$ and $w, w' \in \Omega$,
define

$$\eta_{ij}(y, w, w') = \left\| \mathcal{L}(X_j^n \mid X_1^i = yw) - \mathcal{L}(X_j^n \mid X_1^i = yw') \right\|_{\mathrm{TV}}, \tag{1}$$

where $\|\cdot\|_{\mathrm{TV}} = \frac{1}{2}\|\cdot\|_1$ is the total variation norm; likewise, define

$$\bar\eta_{ij} = \max_{y \in \Omega^{i-1},\; w, w' \in \Omega} \eta_{ij}(y, w, w'). \tag{2}$$
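Definitions (1) and (2) can be evaluated by brute force when $\Omega$ and $n$ are small. The following is a minimal sketch, not part of the original text: it assumes the measure is stored as a dictionary from length-$n$ tuples to probabilities and is strictly positive, so that every conditional law is defined. The function names and the two example measures are hypothetical.

```python
from itertools import product

def conditional_law(mu, prefix, j):
    """Law of (X_j, ..., X_n) given X_1^i = prefix, for a measure stored as {tuple: prob}."""
    i = len(prefix)
    law = {}
    for x, p in mu.items():
        if x[:i] == prefix:
            law[x[j - 1:]] = law.get(x[j - 1:], 0.0) + p
    total = sum(law.values())  # positive because mu is assumed strictly positive
    return {tail: p / total for tail, p in law.items()}

def total_variation(p, q):
    """Total variation distance, i.e. half the l1 distance."""
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in set(p) | set(q))

def eta_bar(mu, omega, i, j):
    """Maximum of eta_ij(y, w, w') over y in omega^(i-1) and w, w' in omega (1-indexed, i < j)."""
    return max(
        total_variation(conditional_law(mu, y + (w,), j), conditional_law(mu, y + (w2,), j))
        for y in product(omega, repeat=i - 1)
        for w in omega
        for w2 in omega
    )

omega = (0, 1)
# Example 1: the uniform (product) measure on {0,1}^3.
uniform = {x: 1 / 8 for x in product(omega, repeat=3)}
# Example 2: X_1 uniform, X_2 copies X_1 with probability 0.9, X_3 uniform and independent.
noisy_copy = {x: 0.5 * (0.9 if x[1] == x[0] else 0.1) * 0.5 for x in product(omega, repeat=3)}
```

For the product measure every coefficient vanishes, while for the second example only the pair $(i, j) = (1, 2)$ carries dependence.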

This notion of mixing is by no means new; it can be traced (at least implicitly) to Marton's 
work [16] and is quite explicit in Samson [17] and Chazottes et al. [7]. We are not aware of a 
standardized term for this type of mixing, and have referred to it as $\eta$-mixing in previous work [13].
It was observed in [17] that the $\phi$-mixing coefficients bound the $\eta$-mixing ones:

$$\bar\eta_{ij} \le 2\phi_{j-i},$$



and conjectured in [11] that

$$\frac{1}{2}\sum_{i=1}^{n-1} \phi_i \le 1 + \max_{1 \le i < n} \sum_{j=i+1}^{n} \bar\eta_{ij};$$

the latter remains open.

In all instances, $\eta$-mixing has come up in the context of concentration of measure. In particular,
define $\Gamma$ and $\Delta$ to be upper-triangular $n \times n$ matrices, with $\Gamma_{ii} = \Delta_{ii} = 1$ and

$$\Gamma_{ij} = \sqrt{\bar\eta_{ij}}, \qquad \Delta_{ij} = \bar\eta_{ij} \tag{3}$$

for $1 \le i < j \le n$.

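Samson [17] proved that any distribution $\mu$ on $[0,1]^n$ and any convex $f : [0,1]^n \to \mathbb{R}$ with
$\|f\|_{\mathrm{Lip}} \le 1$ (with respect to $\ell_2$) satisfy

$$\mu\{|f - \mu f| > t\} \le 2\exp\left(-\frac{t^2}{2\|\Gamma\|_2^2}\right) \tag{4}$$

where $\|\Gamma\|_2$ is the $\ell_2$ operator norm.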

Chazottes et al. [7] and, independently, the author with K. Ramanan [13] showed that any
distribution $\mu$ on $\Omega^n$ and any $f : \Omega^n \to \mathbb{R}$ with $\|f\|_{\mathrm{Lip}} \le n^{-1/2}$ (with respect to the Hamming
metric) satisfy

$$\mu\{|f - \mu f| > t\} \le 2\exp\left(-\frac{t^2}{2\|\Delta\|_\infty^2}\right) \tag{5}$$

where $\|\Delta\|_\infty$ is the $\ell_\infty$ operator norm ($\|\Delta\|_\infty$ may be replaced by $\|\Delta\|_2$, and [7] achieves a better
constant in the exponent).
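To make the role of $\Delta$ concrete, the sketch below builds $\Delta$ from a table of $\bar\eta_{ij}$ values and evaluates a tail bound of the form $2\exp(-t^2/(2\|\Delta\|_\infty^2))$. This is an illustration under stated assumptions: the exact shape of the exponent is our reading of (5), the $\ell_\infty$ operator norm is taken as the maximal absolute row sum, and the function names are hypothetical.

```python
import math

def delta_matrix(eta):
    """Upper-triangular matrix with unit diagonal and eta[i][j] above the diagonal (0-indexed)."""
    n = len(eta)
    return [
        [1.0 if i == j else (eta[i][j] if i < j else 0.0) for j in range(n)]
        for i in range(n)
    ]

def linf_operator_norm(matrix):
    """Maximal absolute row sum."""
    return max(sum(abs(value) for value in row) for row in matrix)

def hamming_tail_bound(eta, t):
    """Assumed form of the right-hand side of (5): 2 exp(-t^2 / (2 ||Delta||_inf^2))."""
    norm = linf_operator_norm(delta_matrix(eta))
    return 2.0 * math.exp(-t * t / (2.0 * norm * norm))
```

For an independent sequence all $\bar\eta_{ij}$ vanish, $\|\Delta\|_\infty = 1$, and the bound reduces to $2\exp(-t^2/2)$; stronger dependence inflates the norm and weakens the bound.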

Results of type (4) and (5) are known as concentration of measure inequalities; broadly, they 
assert that any "sufficiently continuous" function is tightly concentrated about its mean. Such 
bounds have a remarkable range of applications, spanning abstract fields such as asymptotic Banach 
space theory [1, 18] as well as more practical ones such as randomized algorithms [8] and machine
learning [2]. Strong laws of large numbers are readily obtained from concentration bounds [12]. 

Having motivated the study of mixing and measure concentration, let us turn to the behavior
of the $\eta$-mixing coefficients. It is immediate from the construction that $(\bar\eta_{ij})$ is an upper-triangular
$n \times n$ matrix satisfying

(P1) $\bar\eta_{ij} = 0$ for $i \ge j$

(P2) $0 \le \bar\eta_{ij} \le 1$ for $1 \le i < j \le n$.

It is also simple to show (as we shall do below in Lemma 2.1) that

(P3) $\bar\eta_{ij_2} \le \bar\eta_{ij_1}$ for $i < j_1 < j_2$.
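The three conditions are easy to test mechanically. The following sketch checks (P1)-(P3) for a square matrix given as a list of rows; it reads (P1) as requiring zeros on and below the diagonal, which is an assumption on our part, and the function name and tolerance are hypothetical.

```python
def is_valid_mixing_matrix(h, tol=1e-12):
    """Check (P1)-(P3) for a square matrix h stored as a list of rows (0-indexed)."""
    n = len(h)
    for i in range(n):
        for j in range(n):
            if j <= i:
                # (P1): entries on and below the diagonal vanish.
                if abs(h[i][j]) > tol:
                    return False
            elif not -tol <= h[i][j] <= 1 + tol:
                # (P2): entries above the diagonal lie in [0, 1].
                return False
            elif j + 1 < n and h[i][j + 1] > h[i][j] + tol:
                # (P3): each row is nonincreasing to the right of the diagonal.
                return False
    return True
```

Only adjacent columns are compared for (P3), since monotonicity between neighbors implies it for all $j_1 < j_2$.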

1.2 Main results 

A natural question (first posed in [11]) is whether the conditions (P1)-(P3) completely characterize
the possible $(\bar\eta_{ij})$ matrices, or if there are some other constraints that the $\eta$-mixing coefficients must
satisfy. The main technical result of this note is Theorem 2.7, which resolves this question in the
affirmative. Thus, for any "valid" (i.e., satisfying (P1)-(P3)) $n \times n$ matrix $H = (h_{ij})$, there is a
finite set $\Omega$ and a probability measure $\mu$ on $\Omega^n$ such that $\bar\eta_{ij}(\mu) = h_{ij}$ for $1 \le i < j \le n$.

More broadly, it is of interest to characterize the possible mixing rates that various processes 
may have. Chapter 26 of [6] deals with this question and gives several intricate constructions of 
random processes having prescribed mixing rates, under various types of strong mixing. Following 
the work of Kesten and O'Brien [10], it emerged that essentially arbitrary mixing rates are possible 
for various mixing notions. Thus it is not surprising that the same holds true for $\eta$-mixing; this is
an easy consequence of our main result (Corollary 2.9). 

Along the way, we collect various other observations regarding the $\eta$-mixing coefficients, some
of which are auxiliary in proving our main results, and others may be of independent interest. 

1.3 Notation 

We use the indicator variable $\mathbb{1}_{\{\cdot\}}$ to assign 0-1 truth values to the predicate in $\{\cdot\}$.

Random variables are capitalized ($X$), specified sequences are written in lowercase ($x \in \Omega^n$),
the shorthand $X_i^j = (X_i, \ldots, X_j)$ is used for all sequences, and sequence concatenation is denoted
multiplicatively: $x_i^j x_{j+1}^k = x_i^k$. Sums will range over the entire space of the summation variable;
thus $\sum_{x_i^j} f(x_i^j)$ stands for

$$\sum_{x_i^j \in \Omega^{j-i+1}} f(x_i^j).$$

The total variation norm is $\|\cdot\|_{\mathrm{TV}} = \frac{1}{2}\|\cdot\|_1$, as in (1)
(the factor of 1/2 is not entirely standard). Unless otherwise stated, $\Omega$ is a finite set. Whenever
we wish to be explicit about the dependence of $\eta_{ij}$ and $\bar\eta_{ij}$ on a given measure $\mu$, we will write
$\eta_{ij}(\mu; y, w, w')$ and $\bar\eta_{ij}(\mu)$, respectively.



2 Constructions and proofs 

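Let us begin with an easy verification that (P3) holds for all $(\bar\eta_{ij})$:

Lemma 2.1. Let $(\bar\eta_{ij})_{1 \le i < j \le n}$ be the $\eta$-mixing matrix associated with a probability measure $\mu$ on
$\Omega^n$. Then, for all $1 \le i < j_1 < j_2 \le n$, we have

$$\bar\eta_{ij_2} \le \bar\eta_{ij_1}.$$

Proof. Fix $1 \le i < j_1 < j_2 \le n$ and $y \in \Omega^{i-1}$, $w, w' \in \Omega$. Writing $x = x_{j_2}^n$, $u = u_{j_1}^{j_2-1}$ and $z = z_{j_1}^n$, we have

$$\begin{aligned}
\eta_{ij_2}(y, w, w') &= \tfrac{1}{2} \sum_{x} \left| \mu(x \mid yw) - \mu(x \mid yw') \right| \\
&= \tfrac{1}{2} \sum_{x} \Bigl| \sum_{u} \left[ \mu(ux \mid yw) - \mu(ux \mid yw') \right] \Bigr| \\
&\le \tfrac{1}{2} \sum_{x} \sum_{u} \left| \mu(ux \mid yw) - \mu(ux \mid yw') \right| \\
&= \tfrac{1}{2} \sum_{z} \left| \mu(z \mid yw) - \mu(z \mid yw') \right| \\
&= \eta_{ij_1}(y, w, w').
\end{aligned}$$

Maximizing both sides over $y$, $w$ and $w'$ gives the claim. □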



Next, we establish a simple continuity property of $\bar\eta_{ij}$:

Lemma 2.2. Suppose $\Omega$ is a finite set and let $\mathcal{P}_+^n$ be the set of all strictly positive probability
measures $\mu$ on $\Omega^n$ (i.e., $\mu(x) > 0$ for all $x \in \Omega^n$). Endow $\mathcal{P}_+^n$ with the metric $\|\cdot\|_{\mathrm{TV}}$. Then, for
all $1 \le i < j \le n$, the functional $\bar\eta_{ij} : \mathcal{P}_+^n \to \mathbb{R}$ is continuous with respect to $\|\cdot\|_{\mathrm{TV}}$.

Proof. The continuity of $\mu \mapsto \eta_{ij}(\mu; y, w, w')$ for fixed $y \in \Omega^{i-1}$, $w, w' \in \Omega$ follows immediately
from Lemma 5.4.1 of [11]. The claim follows since continuity is preserved under finite maxima. □

Remark 2.3. Continuity breaks down on the boundary of $\mathcal{P}_+^n$; see Section 5.4 of [11] for an
example.

Our construction of a measure with the desired mixing coefficients will proceed in stages, the final 
object being composed of intermediate ones. The building blocks will be measures of a particular 
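simple form. For $1 \le k < n$, let $h \in \mathbb{R}^n_{k+1}$ be a vector of length $n - k$, indexed as $(h_{k+1}, \ldots, h_n)$ and satisfying

$$0 \le h_{j+1} \le h_j \le 1$$

for $k < j < n$; any such $h$ will be called a valid $k$th row. We say that the measure $\mu$ on $\Omega^n$ is pure
$k$th row (with respect to $h$) if its $\eta$-mixing matrix $(\bar\eta_{ij})_{1 \le i < j \le n}$ satisfies

$$\bar\eta_{ij} = \mathbb{1}_{\{i = k\}} h_j.$$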
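As a small illustration of these two definitions, the sketch below checks whether a list is a valid $k$th row and builds the target matrix $\mathbb{1}_{\{i=k\}} h_j$ that a pure $k$th row measure must realize. The representation (a Python list holding $h_{k+1}, \ldots, h_n$, with 0-indexed matrix storage) and the function names are assumptions made for this example.

```python
def is_valid_row(h):
    """h lists h_{k+1}, ..., h_n: entries must lie in [0, 1] and be nonincreasing."""
    in_range = all(0.0 <= value <= 1.0 for value in h)
    nonincreasing = all(later <= earlier for earlier, later in zip(h, h[1:]))
    return in_range and nonincreasing

def pure_row_target(n, k, h):
    """n x n matrix whose only nonzero entries are h, placed in row k, columns k+1..n (1-indexed)."""
    if len(h) != n - k or not is_valid_row(h):
        raise ValueError("h must be a valid k-th row of length n - k")
    target = [[0.0] * n for _ in range(n)]
    for offset, value in enumerate(h):
        # 1-indexed entry (k, k + 1 + offset) lives at 0-indexed position [k - 1][k + offset].
        target[k - 1][k + offset] = value
    return target
```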

Our first technical result is the existence of arbitrary pure $k$th row measures:

Lemma 2.4. Fix $1 \le k < n$ and suppose $h \in \mathbb{R}^n_{k+1}$ is a valid $k$th row vector. Then there exists a
measure $\mu$ on $\{0,1\}^n$ which is pure $k$th row with respect to $h$.

Proof. The proof will proceed by algorithmic construction. Let a valid $k$th row vector $h \in \mathbb{R}^n_{k+1}$ be
given. Initialize $\mu^{(n+1)}$ to be the uniform measure:

$$\mu^{(n+1)}(x) = 2^{-n}, \qquad x \in \{0,1\}^n.$$

For $v \in [0,1]$, define the measure $\mu^{(n,v)}$ on $\{0,1\}^n$ by

$$\mu^{(n,v)}(x) = \alpha_n(v)\left[ v\,\mathbb{1}_{\{x_k = x_n\}}\,\mu^{(n+1)}(x) + (1-v)\,\mathbb{1}_{\{x_k \ne x_n\}}\,\mu^{(n+1)}(x) \right], \qquad x \in \{0,1\}^n,$$

where $\alpha_n(v)$ is the normalization constant ensuring that $\sum_x \mu^{(n,v)}(x) = 1$, and define $f_n : [0,1] \to [0,1]$ by

$$f_n(v) = \bar\eta_{k,n}(\mu^{(n,v)}).$$

Lemma 2.2 assures the continuity of $f_n$ and it is straightforward to verify that $f_n(0) = f_n(1) = 1$
and $f_n(1/2) = 0$. Thus, there exists a $v^* \in [0,1]$ such that $f_n(v^*) = h_n$; define the new measure
$\mu^{(n)}$ by

$$\mu^{(n)}(x) = \mu^{(n,v^*)}(x). \tag{6}$$

Similarly, for $v \in [0,1]$, define

$$\mu^{(n-1,v)}(x) = \alpha_{n-1}(v)\left[ v\,\mathbb{1}_{\{x_k = x_{n-1}\}}\,\mu^{(n)}(x) + (1-v)\,\mathbb{1}_{\{x_k \ne x_{n-1}\}}\,\mu^{(n)}(x) \right], \qquad x \in \{0,1\}^n$$

(where $\alpha_{n-1}(v)$ is again the appropriate normalization constant) and define $f_{n-1} : [0,1] \to [0,1]$ by

$$f_{n-1}(v) = \bar\eta_{k,n-1}(\mu^{(n-1,v)}).$$

Again, it is easily seen that $f_{n-1}(0) = f_{n-1}(1) = 1$ and $f_{n-1}(1/2) = h_n \le h_{n-1}$, so by continuity there is a
$v^* \in [0,1]$ for which $f_{n-1}(v^*) = h_{n-1}$; so we may define the new measure

$$\mu^{(n-1)}(x) = \mu^{(n-1,v^*)}(x). \tag{7}$$

By construction, we have $\bar\eta_{k,n-1}(\mu^{(n-1)}) = h_{n-1}$; we claim that additionally,

$$\bar\eta_{k,n}(\mu^{(n-1)}) = h_n \tag{8}$$

(in other words, the second modification in (7) did not "ruin" the effects of the first modification
in (6)). The claim in (8) holds because in fact for all $y \in \{0,1\}^k$ and $x \in \{0,1\}$, we have

$$\mu^{(n)}\{X_n = x \mid X_1^k = y\} = \mu^{(n-1)}\{X_n = x \mid X_1^k = y\}; \tag{9}$$

the latter fact is straightforward (though somewhat tedious) to verify.

We may now proceed by induction. Let $\mu^{(t)}$ be defined, for $k + 1 < t \le n$. Define, for $v \in [0,1]$,

$$\mu^{(t-1,v)}(x) = \alpha_{t-1}(v)\left[ v\,\mathbb{1}_{\{x_k = x_{t-1}\}}\,\mu^{(t)}(x) + (1-v)\,\mathbb{1}_{\{x_k \ne x_{t-1}\}}\,\mu^{(t)}(x) \right], \qquad x \in \{0,1\}^n$$

and let $f_{t-1} : [0,1] \to [0,1]$ be

$$f_{t-1}(v) = \bar\eta_{k,t-1}(\mu^{(t-1,v)}).$$

Choose $v^* \in [0,1]$ so that $f_{t-1}(v^*) = h_{t-1}$ and define the new measure

$$\mu^{(t-1)} = \mu^{(t-1,v^*)}.$$

Again, a straightforward calculation gives

$$\mu^{(t)}\{X_t^n = x \mid X_1^k = y\} = \mu^{(t-1)}\{X_t^n = x \mid X_1^k = y\} \tag{10}$$

for all $y \in \{0,1\}^k$ and all $x \in \{0,1\}^{n-t+1}$, which ensures that

$$\bar\eta_{k,t}(\mu^{(t-1)}),\ \bar\eta_{k,t+1}(\mu^{(t-1)}),\ \ldots,\ \bar\eta_{k,n}(\mu^{(t-1)})$$

all have the right values. The process terminates when we have constructed $\mu^{(k+1)}$; this is our
desired pure $k$th row measure with respect to $h$. It remains to verify that $\bar\eta_{ij}(\mu^{(k+1)}) = 0$ for $i \ne k$,
but this is almost immediate. □
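The construction in the proof can be simulated for small $n$. The sketch below is a numerical illustration rather than the proof's own procedure: it replaces the existence argument for $v^*$ by a bisection search on $[1/2, 1]$, and it assumes all target values are strictly below 1 so that every intermediate measure stays strictly positive. The dictionary representation and all function names are hypothetical.

```python
from itertools import product

def tv_given(mu, prefix_a, prefix_b, j):
    """TV distance between the laws of (X_j, ..., X_n) given two prefixes of equal length."""
    def law(prefix):
        out = {}
        for x, p in mu.items():
            if x[:len(prefix)] == prefix:
                out[x[j - 1:]] = out.get(x[j - 1:], 0.0) + p
        total = sum(out.values())
        return {tail: p / total for tail, p in out.items()}
    a, b = law(prefix_a), law(prefix_b)
    return 0.5 * sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in set(a) | set(b))

def eta_bar(mu, i, j):
    """Brute-force eta-bar_ij for a strictly positive measure on {0,1}^n (1-indexed, i < j)."""
    return max(tv_given(mu, y + (w,), y + (w2,), j)
               for y in product((0, 1), repeat=i - 1)
               for w in (0, 1) for w2 in (0, 1))

def tilt(mu, k, t, v):
    """Weight mu by v where x_k == x_t and by 1 - v elsewhere, then renormalize."""
    raw = {x: p * (v if x[k - 1] == x[t - 1] else 1.0 - v) for x, p in mu.items()}
    total = sum(raw.values())
    return {x: p / total for x, p in raw.items()}

def pure_row_measure(n, k, h, iterations=60):
    """h maps j in {k+1, ..., n} to the target h_j (nonincreasing, assumed < 1)."""
    mu = {x: 2.0 ** -n for x in product((0, 1), repeat=n)}  # uniform starting measure
    for t in range(n, k, -1):  # backwards order: t = n, n-1, ..., k+1
        lo, hi = 0.5, 1.0
        for _ in range(iterations):  # bisection for v* with eta-bar_{k,t} = h_t
            mid = (lo + hi) / 2.0
            if eta_bar(tilt(mu, k, t, mid), k, t) < h[t]:
                lo = mid
            else:
                hi = mid
        mu = tilt(mu, k, t, hi)
    return mu

example = pure_row_measure(4, 2, {3: 0.6, 4: 0.25})
```

In this example the second row of the resulting mixing matrix should match the targets $0.6$ and $0.25$ up to numerical error, with all other coefficients vanishing. The search is restricted to $[1/2, 1]$ because the value at $v = 1/2$ is already at most the target, by monotonicity of $h$.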

Remark 2.5. The "backwards" order of constructing the measures $\mu^{(t)}$, with $t = n, n-1, \ldots, k+1$,
is essential. A construction in the "forward" order fails precisely because (10) no longer holds. The
reader is invited to verify that the marginals of the constructed measure $\mu = \mu^{(k+1)}$ are identical,
with $\mu\{X_i = 0\} = \mu\{X_i = 1\} = 1/2$ for $1 \le i \le n$.

Next we turn to product measures. There are (at least) two natural ways to form products 
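of probability measures; we shall refer to them as series and parallel. Let $\mathcal{X}$, $\mathcal{Y}$ be finite sets and
$m, n \in \mathbb{N}$. If $\mu$ is a measure on $\mathcal{X}^m$ and $\nu$ a measure on $\mathcal{X}^n$, we define their series product, denoted
by $\mu \oplus \nu$, to be the following measure on $\mathcal{X}^{m+n}$:

$$(\mu \oplus \nu)(z) = \mu(x)\nu(y), \qquad z = xy \in \mathcal{X}^{m+n},\ x \in \mathcal{X}^m,\ y \in \mathcal{X}^n. \tag{11}$$

If $\mu$ is a measure on $\mathcal{X}^n$ and $\nu$ a measure on $\mathcal{Y}^n$, we define their parallel product, denoted by $\mu \otimes \nu$,
to be the following measure on $(\mathcal{X} \times \mathcal{Y})^n$:

$$(\mu \otimes \nu)(z) = \mu(x)\nu(y), \qquad z = (x, y) \in (\mathcal{X} \times \mathcal{Y})^n.$$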
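Both products are straightforward to realize for finitely supported measures. The sketch below assumes measures are stored as dictionaries from tuples to probabilities; the function names are hypothetical. In the series product the two sequences are concatenated, while in the parallel product they are zipped coordinate by coordinate, so that each letter of the result is a pair.

```python
def series_product(mu, nu):
    """(mu (+) nu)(xy) = mu(x) nu(y): concatenate a length-m and a length-n sequence."""
    return {x + y: p * q for x, p in mu.items() for y, q in nu.items()}

def parallel_product(mu, nu):
    """(mu (x) nu)(z) = mu(x) nu(y), where z_t = (x_t, y_t) for each coordinate t."""
    out = {}
    for x, p in mu.items():
        for y, q in nu.items():
            if len(x) != len(y):
                raise ValueError("parallel product needs sequences of equal length")
            out[tuple(zip(x, y))] = p * q
    return out
```

The series product lengthens the sequence and keeps the alphabet, whereas the parallel product keeps the length and enlarges the alphabet.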

As our main construction will involve parallel products of measures, the following simple result 
is useful. 

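Lemma 2.6. Let $\mu$ and $\nu$ be probability measures on $\mathcal{X}^n$ and $\mathcal{Y}^n$, respectively, and let $\bar\eta_{ij}(\mu)$, $\bar\eta_{ij}(\nu)$
and $\bar\eta_{ij}(\mu \otimes \nu)$ be the corresponding $\eta$-mixing matrices. Then we have

$$\max\{\bar\eta_{ij}(\mu), \bar\eta_{ij}(\nu)\} \le \bar\eta_{ij}(\mu \otimes \nu) \le \bar\eta_{ij}(\mu) + \bar\eta_{ij}(\nu) \tag{12}$$

for all $1 \le i < j \le n$.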
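Proof. Fix $i < j$. Throughout this proof, $x$ will denote sequences over $\mathcal{X}$, $y$ sequences over $\mathcal{Y}$, and
$z = (x, y)$ over $\mathcal{X} \times \mathcal{Y}$. Pick arbitrary $z_1^{i-1} = (x_1^{i-1}, y_1^{i-1})$ and $z_i = (x_i, y_i)$, $z_i' = (x_i', y_i')$. Then we
expand

$$\begin{aligned}
\eta_{ij}(\mu \otimes \nu; z_1^{i-1}, z_i, z_i') &= \left\| (\mu \otimes \nu)(\cdot \mid z_1^{i-1} z_i) - (\mu \otimes \nu)(\cdot \mid z_1^{i-1} z_i') \right\|_{\mathrm{TV}} \qquad (13) \\
&= \tfrac{1}{2} \sum_{z_j^n} \left| (\mu \otimes \nu)(z_j^n \mid z_1^{i-1} z_i) - (\mu \otimes \nu)(z_j^n \mid z_1^{i-1} z_i') \right| \\
&= \tfrac{1}{2} \sum_{x_j^n} \sum_{y_j^n} \left| \mu(x_j^n \mid x_1^{i-1} x_i)\,\nu(y_j^n \mid y_1^{i-1} y_i) - \mu(x_j^n \mid x_1^{i-1} x_i')\,\nu(y_j^n \mid y_1^{i-1} y_i') \right| \\
&\ge \tfrac{1}{2} \sum_{x_j^n} \Bigl| \sum_{y_j^n} \left[ \mu(x_j^n \mid x_1^{i-1} x_i)\,\nu(y_j^n \mid y_1^{i-1} y_i) - \mu(x_j^n \mid x_1^{i-1} x_i')\,\nu(y_j^n \mid y_1^{i-1} y_i') \right] \Bigr| \\
&= \tfrac{1}{2} \sum_{x_j^n} \left| \mu(x_j^n \mid x_1^{i-1} x_i) - \mu(x_j^n \mid x_1^{i-1} x_i') \right| \\
&= \eta_{ij}(\mu; x_1^{i-1}, x_i, x_i').
\end{aligned}$$

Exchanging the roles of $x$ and $y$ yields the lower bound in (12). To obtain the upper bound, we
apply the $\|\cdot\|_{\mathrm{TV}}$ tensorization property (see Lemma 2.2.5 in [11]) to (13):

$$\begin{aligned}
\left\| (\mu \otimes \nu)(\cdot \mid z_1^{i-1} z_i) - (\mu \otimes \nu)(\cdot \mid z_1^{i-1} z_i') \right\|_{\mathrm{TV}} \le{} & \left\| \mu(\cdot \mid x_1^{i-1} x_i) - \mu(\cdot \mid x_1^{i-1} x_i') \right\|_{\mathrm{TV}} + \left\| \nu(\cdot \mid y_1^{i-1} y_i) - \nu(\cdot \mid y_1^{i-1} y_i') \right\|_{\mathrm{TV}} \\
& - \left\| \mu(\cdot \mid x_1^{i-1} x_i) - \mu(\cdot \mid x_1^{i-1} x_i') \right\|_{\mathrm{TV}} \left\| \nu(\cdot \mid y_1^{i-1} y_i) - \nu(\cdot \mid y_1^{i-1} y_i') \right\|_{\mathrm{TV}},
\end{aligned}$$

which yields the desired bound. □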
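The two inequalities used in the proof can be checked numerically on a toy case. The sketch below takes two pairs of distributions on a two-point set and compares the total variation distance between the product laws with $\max\{a, b\}$ and $a + b - ab$, where $a$ and $b$ are the distances between the factors. The specific numbers and the function names are illustrative assumptions, not taken from [11].

```python
def total_variation(p, q):
    """Half the l1 distance between two distributions stored as dictionaries."""
    return 0.5 * sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in set(p) | set(q))

def tensor(p, q):
    """Product distribution on pairs."""
    return {(a, b): pa * qb for a, pa in p.items() for b, qb in q.items()}

def tensorization_bounds(p, p_alt, q, q_alt):
    """Return (lower, joint, upper) with lower = max(a, b) and upper = a + b - a*b."""
    a = total_variation(p, p_alt)
    b = total_variation(q, q_alt)
    joint = total_variation(tensor(p, q), tensor(p_alt, q_alt))
    return max(a, b), joint, a + b - a * b

lower, joint, upper = tensorization_bounds(
    {0: 0.9, 1: 0.1}, {0: 0.4, 1: 0.6},
    {0: 0.7, 1: 0.3}, {0: 0.2, 1: 0.8},
)
```

Here both factor distances equal $0.5$, and the joint distance lies strictly between the two bounds, so neither inequality in (12) is tight in general.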

The interested reader may consult Lemma 3.2.1 of [11] for some observations regarding the
behavior of $\eta$-mixing coefficients under series products.
We are now ready to prove the main result of this note. 

Theorem 2.7. Let $H = (h_{ij})$ be any $n \times n$ matrix satisfying (P1), (P2) and (P3). Then there
exists a finite set $\Omega$ and a probability measure $\mu$ on $\Omega^n$ such that

$$\bar\eta_{ij}(\mu) = h_{ij} \tag{14}$$

for $1 \le i < j \le n$.

Proof. For $k = 1, \ldots, n-1$, let $h^{(k)} \in \mathbb{R}^n_{k+1}$ be the vector $(h_{k,k+1}, h_{k,k+2}, \ldots, h_{k,n})$, i.e., the
entries of the $k$th row of $H$ that are allowed to be nonzero. Then Lemma 2.4 provides a measure $\mu^{(k)}$ on $\{0,1\}^n$ which is
pure $k$th row with respect to $h^{(k)}$. Let $\mu$ be the (parallel) product of these pure $k$th row measures:

$$\mu = \mu^{(1)} \otimes \mu^{(2)} \otimes \cdots \otimes \mu^{(n-1)};$$

note that $\mu$ is a measure on $\Omega^n$, where $\Omega = \{0,1\}^{n-1}$. By definition of pure $k$th row measures and
by Lemma 2.6, we have that (14) holds. □
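The final step of the proof can be spelled out as a short chain of inequalities. This is a sketch under the assumption that Lemma 2.6 is applied repeatedly (by induction on the number of factors) to the parallel product of the $n-1$ pure row measures; the notation $\mu^{(k)}$ for the pure $k$th row measure is ours.

```latex
% Fix 1 <= i < j <= n. By the pure-row property,
%   \bar\eta_{ij}(\mu^{(k)}) = h_{ij} if k = i, and 0 otherwise.
\[
  h_{ij}
  \;=\; \max_{1 \le k < n} \bar\eta_{ij}\bigl(\mu^{(k)}\bigr)
  \;\le\; \bar\eta_{ij}\Bigl(\bigotimes_{k=1}^{n-1} \mu^{(k)}\Bigr)
  \;\le\; \sum_{k=1}^{n-1} \bar\eta_{ij}\bigl(\mu^{(k)}\bigr)
  \;=\; h_{ij}.
\]
```

Since only the factor with $k = i$ contributes to row $i$, the lower and upper bounds of Lemma 2.6 coincide, which forces equality in (14).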
Remark 2.8. Our construction requires an exponential state space, $|\Omega| = 2^{n-1}$. Are there analogous
constructions using fewer states? In Section 5.7 of [11] we constructed a measure $\mu$ on $\{0,1\}^n$
satisfying (14) for the special case where the rows of $H$ are constant: $h_{i,i+1} = h_{i,i+2} = \cdots = h_{i,n}$; it
seems unlikely that the general case is achievable with a constant number of states.

Up to this point, we have been discussing the $\eta$-mixing coefficients of probability measures
on finite sequences. This notion extends quite naturally to random processes, i.e., probability
measures $\mu$ on $\Omega^{\mathbb{N}}$. Let $\mu_n$ be the marginal distribution of $X_1^n$ and denote by $\bar\eta^{(n)}$ the $\eta$-mixing
matrix of $\mu_n$. It is straightforward to verify that in general, $\bar\eta_{ij}^{(n)}$ depends on $n$ and that

$$\bar\eta_{ij}^{(n)} \le \bar\eta_{ij}^{(n+1)}$$

for $1 \le i < j \le n$. Let $\Delta_n(\mu)$ be the $n \times n$ matrix $\Delta$ corresponding to $\mu_n$, as defined in (3). Recall
that the $\ell_\infty$ operator norm of a nonnegative matrix is its maximal row sum. Thus we can define
the $\eta$-mixing rate of the process $\mu$ as the function $R_\mu : \mathbb{N} \to \mathbb{R}$:

$$R_\mu(n) = \|\Delta_n(\mu)\|_\infty.$$

It is clear that (i) $R_\mu$ is nondecreasing and (ii) $1 \le R_\mu(n) \le n$; any function satisfying these
properties will be called a valid rate function.
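Properties (i) and (ii) can only be checked on a finite horizon in practice. The sketch below does exactly that for a candidate rate function given as a Python callable; the function name and the `horizon` parameter are hypothetical, and passing the check up to a horizon is of course only a necessary condition.

```python
def is_valid_rate_prefix(r, horizon):
    """Check that r is nondecreasing and satisfies 1 <= r(n) <= n for n = 1, ..., horizon."""
    previous = None
    for n in range(1, horizon + 1):
        value = r(n)
        if not 1 <= value <= n:
            return False  # violates (ii)
        if previous is not None and value < previous:
            return False  # violates (i)
        previous = value
    return True
```

Note that (ii) forces $r(1) = 1$, so for instance the constant function $2$ is not a valid rate function.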

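Corollary 2.9. Let $r : \mathbb{N} \to \mathbb{N}$ be a valid rate function. Then there is a set $\Omega = \{0,1\}^{\mathbb{N}}$ and a
measure $\mu$ on $\Omega^{\mathbb{N}}$ such that

$$\limsup_{n \to \infty} \frac{R_\mu(n)}{r(n)} = 1. \tag{15}$$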



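Proof. We begin with the simple observation that if $r$ is a valid rate function then for all $k \ge 1$ and
all $0 < \varepsilon < 1$, there is an $n = n(k, \varepsilon) > k$ and an $h = h(k, \varepsilon) \in [0, 1]$ such that

$$1 - \varepsilon \le \frac{1 + h(k, \varepsilon)(n - k)}{r(n)} \le 1. \tag{16}$$

Let $1 > \varepsilon_1 > \varepsilon_2 > \cdots > 0$ be a sequence decreasing to 0. Pick a $k \ge 1$ and let $n(k) = n(k, \varepsilon_k)$ and
$h(k) = h(k, \varepsilon_k)$, as stipulated in (16). Define $h^{(k)} \in \mathbb{R}^{n(k)}_{k+1}$ by

$$h^{(k)}_j = h(k), \qquad k < j \le n(k),$$

and let $\mu^{(k)}$ be the measure on $\{0,1\}^{n(k)}$ which is pure $k$th row with respect to $h^{(k)}$, as constructed
in Lemma 2.4. Let $\beta$ be the symmetric Bernoulli measure on $\{0,1\}$ (i.e., $\beta(0) = \beta(1) = 1/2$) and
define the measure $\tilde\mu^{(k)}$ on $\{0,1\}^{\mathbb{N}}$ by

$$\tilde\mu^{(k)} = \mu^{(k)} \oplus \beta \oplus \beta \oplus \cdots$$

where the operation $\oplus$ is defined in (11). In this way, we have obtained a countable collection of
measures $\{\tilde\mu^{(k)} : k = 1, 2, \ldots\}$ on $\{0,1\}^{\mathbb{N}}$; note that by construction, the $k$th row sum of $\Delta_{n(k)}(\tilde\mu^{(k)})$ is $1 + h(k)(n(k) - k)$ and every other row sum is 1, so that for each $k$

$$1 - \varepsilon_k \le \frac{\|\Delta_{n(k)}(\tilde\mu^{(k)})\|_\infty}{r(n(k))} \le 1. \tag{17}$$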
Now let $\mu$ be the measure on $(\{0,1\}^{\mathbb{N}})^{\mathbb{N}}$ obtained by taking the (parallel) product of all the $\tilde\mu^{(k)}$'s:

$$\mu = \tilde\mu^{(1)} \otimes \tilde\mu^{(2)} \otimes \cdots$$

(the $\otimes$ operator is defined just before Lemma 2.6). It remains to verify that $\mu$ is a well-defined probability measure
on $\Omega^{\mathbb{N}}$, $\Omega = \{0,1\}^{\mathbb{N}}$, by applying the Ionescu Tulcea theorem ([9, Theorem 6.17]), and that (17)
continues to hold when $\tilde\mu^{(k)}$ is replaced with $\mu$; the latter is straightforward. □
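The "simple observation" at the start of the proof can be made constructive. The sketch below searches for a pair $(n, h)$ with $n > k$ and $h \in [0, 1]$ such that $(1 + h(n-k))/r(n)$ lies in $[1 - \varepsilon, 1]$. Two assumptions are made here: the numerator is taken to be the $k$th row sum $1 + h(n - k)$ of $\Delta$ (unit diagonal plus $n - k$ entries equal to $h$), and $r$ is assumed to be a valid rate function. The function name and the `max_n` cutoff are hypothetical.

```python
def find_n_and_h(r, k, eps, max_n=10**6):
    """Return (n, h) with n > k, 0 <= h <= 1 and 1 - eps <= (1 + h*(n - k)) / r(n) <= 1."""
    for n in range(k + 1, max_n + 1):
        target = r(n)
        h = (target - 1) / (n - k)
        if h <= 1:
            # The row sum 1 + h*(n - k) equals r(n), so the ratio is 1.
            return n, h
        # Otherwise even h = 1 undershoots r(n); accept once the ratio reaches 1 - eps.
        if (1 + (n - k)) / target >= 1 - eps:
            return n, 1.0
    raise ValueError("no admissible n found up to max_n")
```

Because $r(n) \le n$, the ratio $(n - k + 1)/r(n)$ with $h = 1$ is at least $(n - k + 1)/n$, which tends to 1, so the search terminates for every $k$ and $\varepsilon$ once `max_n` is large enough.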

Remark 2.10. Our construction required an uncountable state space, $\Omega = \{0,1\}^{\mathbb{N}}$. Are analogous
constructions possible with smaller $\Omega$? Is there a construction achieving (15) with $\lim$ in place of
$\limsup$?